StixelNet: A Deep Convolutional Network for Obstacle Detection and Road Segmentation
Abstract
Obstacle detection is a fundamental technological enabler for autonomous driving and vehicle active safety applications. While dense laser scanners are best suited to the task (e.g. Google’s self-driving car), camera-based systems, which are significantly less expensive, continue to improve. Stereo-based commercial solutions such as Daimler’s “intelligent drive” are good at general obstacle detection, while monocular-based systems such as Mobileye’s are usually designed to detect specific categories of objects (cars, pedestrians, etc.). General obstacle detection remains a difficult task for monocular camera-based systems, even though such systems have clear advantages over stereo-based ones in terms of cost and packaging size.
Another related task commonly performed by camera-based systems is scene labeling, in which a label (e.g. road, car, sidewalk) is assigned to each pixel in the image. This yields a full detection and segmentation of all the obstacles and of the road, but scene labeling is generally a difficult task. Instead, we propose in this paper to solve a more constrained task: detecting, in each image column, the image contact point (pixel) between the closest obstacle and the ground, as depicted in Figure 1(Left). The idea is borrowed from the “Stixel-World” obstacle representation [1], in which the obstacle in each column is represented by a so-called “Stixel”; our goal is to find the bottom pixel of each such “Stixel”. Note that since we do not consider every non-road object (e.g. sidewalk, grass) to be an obstacle, the task of road segmentation is different from obstacle detection. Note also that the term “free-space detection” is used ambiguously in the literature to describe both the obstacle detection task above [1] and the road segmentation task [4]. Current “Stixel-based” methods [1] for general obstacle detection use stereo vision, while our method is monocular-based.
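To make the column-wise target concrete, here is a minimal Python sketch of how a per-column contact point could be read off a per-pixel obstacle mask. It is an illustration only, not the authors' labeling procedure: the function name `stixel_bottoms`, the boolean-mask input, and the `-1` sentinel for empty columns are all hypothetical, and it assumes that image rows are indexed from top to bottom so that the lowest obstacle pixel in a column belongs to the closest obstacle.

```python
def stixel_bottoms(mask):
    """Return, for each column, the row of the lowest obstacle pixel (-1 if none).

    mask: list of image rows (top to bottom), each a list of booleans,
          True where the pixel belongs to an obstacle. (Hypothetical input format.)
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    bottoms = [-1] * width
    for col in range(width):
        # Scan upward from the bottom row; the first obstacle pixel met
        # is taken as the obstacle/ground contact point for this column.
        for row in range(height - 1, -1, -1):
            if mask[row][col]:
                bottoms[col] = row
                break
    return bottoms


mask = [
    [False, False, True],
    [True,  False, False],
    [True,  False, False],
    [False, False, True],
]
print(stixel_bottoms(mask))  # [2, -1, 3]
```

The output has one value per image column, which is what makes the task a per-column regression rather than a per-pixel labeling problem.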
A different approach to monocular-based obstacle detection relies on the host vehicle's motion and uses Structure-from-Motion (SfM) from sequences of frames in the video [3]. In contrast, our method uses a single image as input and therefore also operates when the host vehicle is stationary. In addition, the SfM approach is orthogonal to ours and can therefore be combined with it later to improve performance. For the task of road segmentation, the common approach is to perform pixel- or patch-level classification [4]. In contrast, we propose to solve the problem using the same column-based regression approach as for obstacle detection. Our approach is novel in providing a unified framework for both the obstacle detection and road segmentation tasks, and in using the first to facilitate the second in the training phase.
We propose solving the obstacle detection task using a two-stage approach. In the first stage we divide the image into columns and solve the detection as a regression problem using a convolutional neural network, which we call “StixelNet”. Figure 1(Right) shows an example network input and output. In the second stage we improve the results using interactions between neighboring columns, imposing smoothness constraints via a Conditional Random Field (CRF) over consecutive columns. The flowchart of the obstacle detection algorithm is presented in Figure 2(top). To train the network we introduce a new loss function based on a semi-discrete representation of the obstacle position probability. In this approach we model the probability P(y) of the obstacle position as a piecewise-linear probability distribution.
The road segmentation is done in three stages. The first two, StixelNet (trained on the road segmentation task) followed by a CRF, are the same as in obstacle detection. The final stage performs a graph-cut segmentation on the image to achieve higher accuracy by enforcing road boundaries to coincide with image contours.
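The two ideas above (a piecewise-linear P(y) and smoothing over consecutive columns) can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the paper's implementation: the abstract does not give the formulas, so the code assumes that P(y) is obtained by linearly interpolating per-bin outputs between bin centers, that the loss is the negative log of P at the true position, and that the CRF is a chain decoded with the Viterbi algorithm using an absolute-difference smoothness cost. The names `pl_probability`, `pl_loss`, `smooth_columns`, and the `weight` parameter are hypothetical.

```python
import math


def pl_probability(y, centers, probs):
    """Piecewise-linear P(y): interpolate bin values linearly between bin centers."""
    if y <= centers[0]:
        return probs[0]
    if y >= centers[-1]:
        return probs[-1]
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= y <= hi:
            t = (y - lo) / (hi - lo)
            return (1 - t) * probs[i] + t * probs[i + 1]


def pl_loss(y, centers, probs, eps=1e-12):
    """Assumed loss: negative log of the interpolated probability at the true y."""
    return -math.log(max(pl_probability(y, centers, probs), eps))


def smooth_columns(unary, weight=1.0):
    """Viterbi decoding on a chain of columns.

    unary[c][p]: cost of position p in column c (e.g. -log P(p) for that column).
    Assumed pairwise cost between neighbors: weight * |p_prev - p_cur|.
    Returns the minimum-cost position for every column.
    """
    n_pos = len(unary[0])
    cost = list(unary[0])
    back = []  # back[c - 1][cur] = best previous position for column c
    for c in range(1, len(unary)):
        new_cost, pointers = [], []
        for cur in range(n_pos):
            best_prev = min(
                range(n_pos),
                key=lambda prev: cost[prev] + weight * abs(prev - cur),
            )
            pointers.append(best_prev)
            new_cost.append(
                cost[best_prev] + weight * abs(best_prev - cur) + unary[c][cur]
            )
        cost = new_cost
        back.append(pointers)
    # Backtrack from the cheapest final position.
    path = [min(range(n_pos), key=lambda p: cost[p])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Column 2 weakly prefers position 3; its neighbors strongly prefer position 1.
unary = [
    [5, 0, 5, 5],
    [5, 0, 5, 5],
    [5, 1, 5, 0],
    [5, 0, 5, 5],
    [5, 0, 5, 5],
]
print(smooth_columns(unary, weight=0.0))  # [1, 1, 3, 1, 1]
print(smooth_columns(unary, weight=1.0))  # [1, 1, 1, 1, 1]
```

With `weight=0.0` each column is decoded independently and the outlier in column 2 survives; with `weight=1.0` the smoothness cost outweighs the weak per-column preference and the outlier is pulled back in line with its neighbors.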
The flowchart of the road segmentation algorithm is presented in Figure 2(bottom). It is well known that having large quantities of labeled data is crucial for training deep CNNs. A major advantage of our unique task formulation is the ability to use laser-scanners, which are excellent at the given ...
Figure 1: (Left) Obstacle detection example. (Right) Input to StixelNet and output example.
Similar resources
A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images
The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...
An efficient method for cloud detection based on the feature-level fusion of Landsat-8 OLI spectral bands in deep convolutional neural network
Cloud segmentation is a critical pre-processing step for any multi-spectral satellite image application. In particular, disaster-related applications e.g., flood monitoring or rapid damage mapping, which are highly time and data-critical, require methods that produce accurate cloud masks in a short time while being able to adapt to large variations in the target domain (induced by atmospheric c...
Non-melanoma skin cancer diagnosis with a convolutional neural network
Background: The most common types of non-melanoma skin cancer are basal cell carcinoma (BCC), and squamous cell carcinoma (SCC). AKIEC -Actinic keratoses (Solar keratoses) and intraepithelial carcinoma (Bowen’s disease)- are common non-invasive precursors of SCC, which may progress to invasive SCC, if left untreated. Due to the importance of early detection in cancer treatment, this study aimed...
Melanoma detection with a deep learning model
Background: Skin cancer is one of the most common forms of cancer in the world and melanoma is the deadliest type of skin cancer. Both melanoma and melanocytic nevi begin in melanocytes (cells that produce melanin). However, melanocytic nevi are benign whereas melanoma is malignant. This work proposes a deep learning model for classification of these two lesions. Methods: In this analytic s...
Real-Time Road Segmentation Using LiDAR Data Processing on an FPGA
This paper presents the FPGA design of a convolutional neural network (CNN) based road segmentation algorithm for real-time processing of LiDAR data. For autonomous vehicles, it is important to perform road segmentation and obstacle detection such that the drivable region can be identified for path planning. Traditional road segmentation algorithms are mainly based on image data from cameras, w...
Publication date: 2015